ABSTRACT

This paper presents the design and implementation of two new parallel algorithms for solving general
tridiagonal linear systems of equations: the parallel splitting Gauss-Jordan algorithm and the general
parallel splitting Gauss-Jordan algorithm. Both algorithms are easily distributed over different numbers of
processors. Our implementations are also efficient: in general, a linear speed-up is obtained as the
number of processors increases.

1. INTRODUCTION

During the past few years, there has been an increasing number of parallel algorithms using many kinds of parallel 
computers to solve very large scientific problems. The main problems for designing parallel algorithms are: 

1. How to break up the problem into smaller subproblems to be treated independently. 

2. How to reorder the problem by restricting the sequence of operations to increase the amount of parallelism. 

In this paper we consider the design and implementation of two new parallel algorithms for solving tridiagonal
systems of equations arising from the numerical solution of partial differential equations using finite difference
approximations.

Direct techniques such as the Gauss-Jordan method, WZ factorization and QR algorithms are particularly suited to
parallel computers. In the past few years there have been several attempts to solve tridiagonal systems in parallel;
see, for example, Hatzopoulos [2], who developed the DWZ algorithm, which is restricted to $n = 2^m$, m = 1, 2, ...,
and Henk [3], who proposed alternative decompositions for tridiagonal systems of equations. In this paper we discuss
the sequential Gauss-Jordan algorithm and its parallel version due to Evans (see [1]) for solving a general tridiagonal
system of equations, and we then introduce two new parallel algorithms and show their advantages.

2. SEQUENTIAL GAUSS-JORDAN ALGORITHM 

Consider the tridiagonal system of equations

$$A x = d \tag{1}$$

where A is an n x n tridiagonal matrix of coefficients, and x and d are vectors of order n holding the unknowns and
the right-hand side respectively, given by




TRANS STELLAR ■ Journal Publications ■ Research Consultancy
www.tjprc.org    editor@tjprc.org

Osama El-Giar



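$$
\begin{pmatrix}
b_1 & c_1 &        &         &         \\
a_2 & b_2 & c_2    &         &         \\
    & \ddots & \ddots & \ddots &       \\
    &     & a_{n-1} & b_{n-1} & c_{n-1} \\
    &     &        & a_n     & b_n
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}
=
\begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_{n-1} \\ d_n \end{pmatrix}
\tag{2}
$$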



The sequential algorithm is given by:

$$g_1 = \frac{c_1}{b_1}, \qquad h_1 = \frac{d_1}{b_1},$$

for i = 2, 3, ..., n

$$g_i = \frac{c_i}{b_i - a_i g_{i-1}} \ (i \neq n), \qquad h_i = \frac{d_i - a_i h_{i-1}}{b_i - a_i g_{i-1}}, \tag{3}$$

and the solution can be obtained by

$$x_n = h_n, \qquad x_i = h_i - g_i x_{i+1}, \quad i = n-1, n-2, \ldots, 1.$$

In the sequential Gauss-Jordan algorithm the steps are almost entirely sequential: at each level we can only calculate
$g_i$ and $h_i$ in parallel. See figure 1 for the sequential Gauss-Jordan algorithm with n = 5.



Figure 1: Sequential Gauss-Jordan for n = 5

Note that we need 10 sequential levels for n = 5; in general, 2n sequential levels are needed to solve an n x n
system.
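
The recurrences (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function name, the list-based storage (with `a[0]` and `c[-1]` unused) and the small 5 x 5 test system are assumptions made here, and no pivoting is attempted, so every denominator is assumed to be nonzero.

```python
def gauss_jordan_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system with the recurrences (3).

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    Assumes every denominator b_i - a_i g_{i-1} is nonzero.
    """
    n = len(b)
    g = [0.0] * n
    h = [0.0] * n
    g[0] = c[0] / b[0]
    h[0] = d[0] / b[0]
    # Forward sweep: n sequential levels.
    for i in range(1, n):
        denom = b[i] - a[i] * g[i - 1]
        if i < n - 1:            # g_n is never needed
            g[i] = c[i] / denom
        h[i] = (d[i] - a[i] * h[i - 1]) / denom
    # Back substitution: another n sequential levels.
    x = [0.0] * n
    x[n - 1] = h[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = h[i] - g[i] * x[i + 1]
    return x


# Hypothetical 5 x 5 test system whose exact solution is x_i = i.
a = [0.0, 1.0, 1.0, 1.0, 1.0]
b = [4.0, 4.0, 4.0, 4.0, 4.0]
c = [1.0, 1.0, 1.0, 1.0, 0.0]
d = [6.0, 12.0, 18.0, 24.0, 24.0]
x = gauss_jordan_tridiagonal(a, b, c, d)
```

The two loops make the 2n sequential levels visible: each iteration depends on the result of the previous one, so only the pair $g_i$, $h_i$ inside one iteration can be computed side by side.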

3. PARALLEL GAUSS-JORDAN ALGORITHM

In this section we discuss the parallel Gauss-Jordan algorithm of J. V. Evans [1]. The idea of this algorithm is
that, after normalization of each element on the main diagonal, the elimination process eliminates the coefficients
both below and above it at the same time, until the original tridiagonal system is completely decoupled and has been
transformed into the unit matrix, so that the solution is immediately available. The parallel Gauss-Jordan algorithm
can be defined as

$$g_1 = \frac{c_1}{b_1}, \qquad h_{1,0} = \frac{d_1}{b_1},$$

for i = 2, 3, ..., n

$$g_i = \frac{c_i}{b_i - a_i g_{i-1}} \ (i \neq n), \qquad h_{i,0} = \frac{d_i - a_i h_{i-1,0}}{b_i - a_i g_{i-1}}, \tag{4}$$

for k = i-1, i-2, ..., 1

$$h_{k,i-k} = h_{k,i-k-1} + (-1)^{i-k} \Big( \prod_{j=k}^{i-1} g_j \Big) h_{i,0},$$

and the solution can be obtained by $x_i = h_{i,n-i}$, i = n, n-1, ..., 1.

The amount of parallelism is now increased: for n = 5 we need only 5 sequential levels, and the work within each
level can be done in parallel (see figure 2). The computation proceeds as follows:

for sequential level 1 (i = 1)

Compute in parallel $g_1$ and $h_{1,0}$.
for sequential level 2 (i = 2)

Compute in parallel $g_2$, $h_{2,0}$ and
$h_{1,1} = h_{1,0} - g_1 h_{2,0}$
for sequential level 3 (i = 3)

Compute in parallel $g_3$, $h_{3,0}$ and
$h_{2,1} = h_{2,0} - g_2 h_{3,0}$
$h_{1,2} = h_{1,1} + g_1 g_2 h_{3,0}$
for sequential level 4 (i = 4)
Compute in parallel $g_4$, $h_{4,0}$ and
$h_{3,1} = h_{3,0} - g_3 h_{4,0}$
$h_{2,2} = h_{2,1} + g_2 g_3 h_{4,0}$
$h_{1,3} = h_{1,2} - g_1 g_2 g_3 h_{4,0}$
for sequential level 5 (i = 5)
Compute in parallel $h_{5,0}$ ($g_5$ is not required, since i = n) and
$h_{4,1} = h_{4,0} - g_4 h_{5,0}$
$h_{3,2} = h_{3,1} + g_3 g_4 h_{5,0}$
$h_{2,3} = h_{2,2} - g_2 g_3 g_4 h_{5,0}$
$h_{1,4} = h_{1,3} + g_1 g_2 g_3 g_4 h_{5,0}$
and the solution can be obtained in parallel at the same sequential step using $x_i = h_{i,n-i}$, i = 1, 2, ..., 5.
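
The level-by-level scheme of algorithm 4 can be mimicked serially as follows. This is only a sketch under stated assumptions: the function name and list layout are hypothetical, the outer loop stands for the sequential levels, and the inner loop stands for the updates that would be handed to separate processors within one level (here they are simply executed one after another).

```python
def parallel_gauss_jordan(a, b, c, d):
    """Serial simulation of the parallel Gauss-Jordan recurrences (4).

    a[0] and c[-1] are unused. acc[k] always holds the latest h_{k,j},
    so after the last level acc[k] equals x_{k+1}.
    Assumes every denominator is nonzero (no pivoting).
    """
    n = len(b)
    g = [0.0] * n
    h0 = [0.0] * n           # h_{i,0}
    acc = [0.0] * n          # running h_{k,j}
    g[0] = c[0] / b[0]
    h0[0] = acc[0] = d[0] / b[0]
    for i in range(1, n):    # one sequential level per i
        denom = b[i] - a[i] * g[i - 1]
        if i < n - 1:
            g[i] = c[i] / denom
        h0[i] = acc[i] = (d[i] - a[i] * h0[i - 1]) / denom
        # Updates for rows k = i-1, ..., 0. Each row only needs h_{i,0}
        # and a product of g values, so the rows are mutually independent.
        prod = 1.0
        for k in range(i - 1, -1, -1):
            prod *= -g[k]    # (-1)^(i-k) * g_k * ... * g_{i-1}
            acc[k] += prod * h0[i]
    return acc


# Hypothetical 5 x 5 test system whose exact solution is x_i = i.
x = parallel_gauss_jordan(
    [0.0, 1.0, 1.0, 1.0, 1.0],
    [4.0, 4.0, 4.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 1.0, 0.0],
    [6.0, 12.0, 18.0, 24.0, 24.0],
)
```

No back substitution is needed: once level n has finished, every accumulator already holds its unknown, which is why the number of sequential levels drops from 2n to n.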
4. THE PARALLEL SPLITTING GAUSS-JORDAN ALGORITHM 

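The idea of the parallel splitting Gauss-Jordan algorithm is that the n x n system is split into two independent
$\frac{n}{2} \times \frac{n}{2}$ systems, each of which can then be treated independently and in parallel. We treat
the first $\frac{n}{2}$ equations using the Gauss-Jordan algorithm 3, from i = 1 to i = $\frac{n}{2}$, and at the
same time treat the second system using the backward Gauss-Jordan algorithm 9, from i = n backward to
i = $\frac{n}{2} + 1$.

To split the original system 2 into two independent $\frac{n}{2} \times \frac{n}{2}$ systems of equations, let n be
even. We have to get rid of $c_{n/2}$ in the $\frac{n}{2}$th equation and of $a_{n/2+1}$ in the
$(\frac{n}{2}+1)$th equation. Thus let

$$\beta \, x_{\frac{n}{2}} = x_{\frac{n}{2}+1}, \tag{5}$$

hence $x_{\frac{n}{2}+1}$ and $x_{\frac{n}{2}}$ can be eliminated from the $\frac{n}{2}$th equation and the
$(\frac{n}{2}+1)$th equation respectively, and the two new systems will be

$$
\begin{pmatrix}
b_1 & c_1 &        &        \\
a_2 & b_2 & c_2    &        \\
    & \ddots & \ddots & \ddots \\
    &     & a_{\frac{n}{2}} & b_{\frac{n}{2}} + \beta c_{\frac{n}{2}}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{\frac{n}{2}} \end{pmatrix}
=
\begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_{\frac{n}{2}} \end{pmatrix}
\tag{6}
$$

Figure 2: Parallel Gauss-Jordan for n = 5

which will be treated by the Gauss-Jordan algorithm 3, but for i = 1, 2, ..., $\frac{n}{2}$, and in this case

$$x_{\frac{n}{2}} = h_{\frac{n}{2}} = \frac{d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1}}{b_{\frac{n}{2}} + \beta c_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1}}, \tag{7}$$

and the second system will be

$$
\begin{pmatrix}
b_{\frac{n}{2}+1} + a_{\frac{n}{2}+1}/\beta & c_{\frac{n}{2}+1} &        &     \\
a_{\frac{n}{2}+2} & b_{\frac{n}{2}+2} & c_{\frac{n}{2}+2} &     \\
    & \ddots & \ddots & \ddots \\
    &        & a_n    & b_n
\end{pmatrix}
\begin{pmatrix} x_{\frac{n}{2}+1} \\ x_{\frac{n}{2}+2} \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} d_{\frac{n}{2}+1} \\ d_{\frac{n}{2}+2} \\ \vdots \\ d_n \end{pmatrix}
\tag{8}
$$

which will be treated using the backward Gauss-Jordan algorithm defined by:

$$g_n = \frac{a_n}{b_n}, \qquad h_n = \frac{d_n}{b_n},$$

for i = n-1, n-2, ..., $\frac{n}{2}+1$

$$g_i = \frac{a_i}{b_i - c_i g_{i+1}} \ \Big(i \neq \frac{n}{2}+1\Big), \qquad h_i = \frac{d_i - c_i h_{i+1}}{b_i - c_i g_{i+1}}, \tag{9}$$

and the solution can be obtained by $x_{\frac{n}{2}+1} = h_{\frac{n}{2}+1}$, $x_i = h_i - g_i x_{i-1}$,
i = $\frac{n}{2}+2$, $\frac{n}{2}+3$, ..., n, where

$$x_{\frac{n}{2}+1} = h_{\frac{n}{2}+1} = \frac{d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2}}{b_{\frac{n}{2}+1} + a_{\frac{n}{2}+1}/\beta - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2}}, \tag{10}$$

thus from equation 5, equation 7 and equation 10 we get

$$\beta = \frac{(b_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1})(d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2}) - a_{\frac{n}{2}+1}(d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1})}{(b_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2})(d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1}) - c_{\frac{n}{2}}(d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2})} \tag{11}$$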
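
The step from equations 5, 7 and 10 to the expression for $\beta$ can be written out explicitly. The following is a sketch of that algebra; the shorthand m, P, Q, R and S is introduced here for readability and does not appear in the original text, and the manipulation assumes $\beta \neq 0$ and nonzero denominators.

```latex
% Shorthand (an assumption of this sketch, not the paper's notation):
%   m = n/2
%   P = b_m - a_m g_{m-1},          Q = d_m - a_m h_{m-1}
%   R = b_{m+1} - c_{m+1} g_{m+2},  S = d_{m+1} - c_{m+1} h_{m+2}
\begin{align*}
x_m     &= \frac{Q}{P + \beta c_m}
        && \text{(equation 7)} \\
x_{m+1} &= \frac{S}{R + a_{m+1}/\beta} = \frac{\beta S}{\beta R + a_{m+1}}
        && \text{(equation 10)} \\
\frac{\beta Q}{P + \beta c_m} &= \frac{\beta S}{\beta R + a_{m+1}}
        && \text{(equation 5: } \beta x_m = x_{m+1}\text{)} \\
Q\,(\beta R + a_{m+1}) &= S\,(P + \beta c_m) \\
\beta\,(R Q - c_m S) &= P S - a_{m+1} Q \\
\beta &= \frac{P S - a_{m+1} Q}{R Q - c_m S}
\end{align*}
```

Equivalently, the two middle equations reduce to the 2 x 2 system $P x_m + c_m x_{m+1} = Q$, $a_{m+1} x_m + R x_{m+1} = S$, whose solution by Cramer's rule has $P S - a_{m+1} Q$ proportional to $x_{m+1}$ and $R Q - c_m S$ proportional to $x_m$; this is why $\beta$ vanishes when $x_{m+1} = 0$ and is undefined when $x_m = 0$.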
For example, let n = 8; then the computation proceeds as follows:
for sequential level 1 (i = 1)

In parallel compute: $g_1$, $h_1$ using the Gauss-Jordan algorithm 3 and $g_8$, $h_8$ using the backward Gauss-Jordan algorithm 9.
for sequential level 2 (i = 2)

In parallel compute: $g_2$, $h_2$ using the Gauss-Jordan algorithm 3 and $g_7$, $h_7$ using the backward Gauss-Jordan algorithm 9.
for sequential level 3 (i = 3)

In parallel compute: $g_3$, $h_3$ using the Gauss-Jordan algorithm 3 and $g_6$, $h_6$ using the backward Gauss-Jordan algorithm 9.

for sequential level 4 (i = 4)

Compute: $\beta$ using equation 11.

for sequential level 5 (i = 5)

In parallel compute: $h_4 = x_4$, $h_5 = x_5$ using equation 7 and equation 5 respectively.
for sequential level 6 (i = 6)

In parallel compute: $x_3$ and $x_6$ using the Gauss-Jordan algorithm 3 and the backward Gauss-Jordan algorithm 9.
for sequential level 7 (i = 7)

In parallel compute: $x_2$ and $x_7$ using the Gauss-Jordan algorithm 3 and the backward Gauss-Jordan algorithm 9.
for sequential level 8 (i = 8)

In parallel compute: $x_1$ and $x_8$ using the Gauss-Jordan algorithm 3 and the backward Gauss-Jordan algorithm 9.

So in this case the amount of parallelism has been increased, and we need only n sequential steps for n unknowns; see
figure 3 for the parallel splitting Gauss-Jordan algorithm with n = 8.
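
The eight levels above can be sketched in Python as follows. This is an illustration under several assumptions, not the author's code: the function name, the list layout, the restriction to n >= 4, and the test system are hypothetical; the forward and backward sweeps are run one after the other although they are independent and would run on separate processors; and the degenerate cases in which $\beta$ is zero or undefined are not handled.

```python
def parallel_splitting_gauss_jordan(a, b, c, d):
    """Serial sketch of the parallel splitting Gauss-Jordan algorithm.

    a[0] and c[-1] are unused. The first system holds m equations,
    where m = n/2 for even n and (n+1)/2 for odd n.
    Returns (x, beta). Assumes nonzero denominators and x_m != 0.
    """
    n = len(b)
    if n < 4:
        raise ValueError("this sketch assumes n >= 4")
    m = (n + 1) // 2
    g = [0.0] * n
    h = [0.0] * n
    # Forward sweep (algorithm 3) on equations 1 .. m-1.
    g[0] = c[0] / b[0]
    h[0] = d[0] / b[0]
    for i in range(1, m - 1):
        denom = b[i] - a[i] * g[i - 1]
        g[i] = c[i] / denom
        h[i] = (d[i] - a[i] * h[i - 1]) / denom
    # Backward sweep (algorithm 9) on equations n .. m+2.
    g[n - 1] = a[n - 1] / b[n - 1]
    h[n - 1] = d[n - 1] / b[n - 1]
    for i in range(n - 2, m, -1):
        denom = b[i] - c[i] * g[i + 1]
        g[i] = a[i] / denom
        h[i] = (d[i] - c[i] * h[i + 1]) / denom
    # The two middle equations give beta (equation 11).
    P = b[m - 1] - a[m - 1] * g[m - 2]
    Q = d[m - 1] - a[m - 1] * h[m - 2]
    R = b[m] - c[m] * g[m + 1]
    S = d[m] - c[m] * h[m + 1]
    beta = (P * S - a[m] * Q) / (R * Q - c[m - 1] * S)
    x = [0.0] * n
    x[m - 1] = Q / (P + beta * c[m - 1])   # equation 7
    x[m] = beta * x[m - 1]                 # equation 5
    # Back substitution outward from the middle, one pair per level.
    for i in range(m - 2, -1, -1):
        x[i] = h[i] - g[i] * x[i + 1]
    for i in range(m + 1, n):
        x[i] = h[i] - g[i] * x[i - 1]
    return x, beta


# Hypothetical 8 x 8 test system with exact solution x_i = i.
n = 8
a = [0.0] + [1.0] * (n - 1)
b = [4.0] * n
c = [1.0] * (n - 1) + [0.0]
exact = [float(i + 1) for i in range(n)]
d = [(a[i] * exact[i - 1] if i > 0 else 0.0) + b[i] * exact[i]
     + (c[i] * exact[i + 1] if i < n - 1 else 0.0) for i in range(n)]
x, beta = parallel_splitting_gauss_jordan(a, b, c, d)
```

For this test system $\beta$ should come out as $x_5 / x_4 = 1.25$, up to rounding, because equation 5 defines it as the ratio of the two middle unknowns.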

5. THE GENERAL PARALLEL SPLITTING GAUSS-JORDAN ALGORITHM

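The idea of the general parallel splitting Gauss-Jordan algorithm is that the n x n system is split into two
independent $\frac{n}{2} \times \frac{n}{2}$ systems, each of which can be treated independently and in parallel:
the first system uses the parallel Gauss-Jordan algorithm 4 (for i = 1, 2, ..., $\frac{n}{2}$), and at the same time
the second system uses the parallel backward Gauss-Jordan algorithm, from i = n backward to i = $\frac{n}{2}+1$,
defined by:

$$g_n = \frac{a_n}{b_n}, \qquad h_{n,0} = \frac{d_n}{b_n},$$

for i = n-1, n-2, ..., $\frac{n}{2}+1$

$$g_i = \frac{a_i}{b_i - c_i g_{i+1}} \ \Big(i \neq \frac{n}{2}+1\Big), \qquad h_{i,0} = \frac{d_i - c_i h_{i+1,0}}{b_i - c_i g_{i+1}}, \tag{12}$$

for k = i+1, i+2, ..., n

$$h_{k,k-i} = h_{k,k-i-1} + (-1)^{k-i} \Big( \prod_{j=i+1}^{k} g_j \Big) h_{i,0},$$

and the solution can be obtained by $x_i = h_{i,\,i-\frac{n}{2}-1}$, i = n, n-1, ..., $\frac{n}{2}+1$,
and, similarly to the parallel splitting Gauss-Jordan algorithm, $\beta$ can be obtained by

$$\beta = \frac{(b_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1})(d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2,0}) - a_{\frac{n}{2}+1}(d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1,0})}{(b_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2})(d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1,0}) - c_{\frac{n}{2}}(d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2,0})} \tag{13}$$



Figure 3: Parallel Splitting Gauss-Jordan for n = 8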

For example, let n = 8; then the computation proceeds as follows:
for sequential level 1 (i = 1)

Compute in parallel $g_1$, $h_{1,0}$ and $g_8$, $h_{8,0}$ using the parallel Gauss-Jordan algorithm 4 and the backward
Gauss-Jordan algorithm 12 respectively.

for sequential level 2 (i = 2)

Compute in parallel $g_2$, $h_{2,0}$ and $g_7$, $h_{7,0}$ using the parallel Gauss-Jordan algorithm 4 and the backward
Gauss-Jordan algorithm 12 respectively, and also in parallel compute
$h_{1,1} = h_{1,0} - g_1 h_{2,0}$ using algorithm 4.
$h_{8,1} = h_{8,0} - g_8 h_{7,0}$ using algorithm 12.
for sequential level 3 (i = 3)

Compute in parallel $g_3$, $h_{3,0}$ and $g_6$, $h_{6,0}$ using the parallel Gauss-Jordan algorithm 4 and the backward
Gauss-Jordan algorithm 12 respectively, and also in parallel compute

$h_{2,1} = h_{2,0} - g_2 h_{3,0}$ using algorithm 4.

$h_{1,2} = h_{1,1} + g_1 g_2 h_{3,0}$ using algorithm 4.

$h_{7,1} = h_{7,0} - g_7 h_{6,0}$ using algorithm 12.

$h_{8,2} = h_{8,1} + g_7 g_8 h_{6,0}$ using algorithm 12.

for sequential level 4 (i = 4)

Compute $\beta$ from $g_3$, $g_6$, $h_{3,0}$ and $h_{6,0}$ using equation 13.
for sequential level 5 (i = 5)

In parallel compute $x_4 = h_{4,0}$ and $x_5 = h_{5,0}$ using the parallel Gauss-Jordan algorithm 4 and the backward
Gauss-Jordan algorithm 12 respectively.

for sequential level 6 (i = 6)

Compute in parallel

$h_{3,1} = h_{3,0} - g_3 h_{4,0}$ using algorithm 4.

$h_{2,2} = h_{2,1} + g_2 g_3 h_{4,0}$ using algorithm 4.

$h_{1,3} = h_{1,2} - g_1 g_2 g_3 h_{4,0}$ using algorithm 4.

$h_{6,1} = h_{6,0} - g_6 h_{5,0}$ using algorithm 12.

$h_{7,2} = h_{7,1} + g_6 g_7 h_{5,0}$ using algorithm 12.

$h_{8,3} = h_{8,2} - g_6 g_7 g_8 h_{5,0}$ using algorithm 12.

and the solution can be obtained immediately in parallel at this step, such that $x_i = h_{i,\frac{n}{2}-i}$,
i = 1, 2, ..., $\frac{n}{2}$ using algorithm 4, and such that $x_i = h_{i,\,i-\frac{n}{2}-1}$,
i = n, n-1, ..., $\frac{n}{2}+1$ using algorithm 12; see figure 4.

Figure 4: General Parallel Splitting Gauss-Jordan for n = 8
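
A serial sketch of the general parallel splitting scheme is given below. It is an illustration only: the function and helper names, the list layout and the restriction to n >= 4 are assumptions made here, the two halves are processed one after the other rather than concurrently, and the cases in which $\beta$ is zero or undefined are not covered. The helpers `spread_up` and `spread_down` apply the $h_{k,j}$ updates of algorithms 4 and 12, which within one level are independent of each other.

```python
def general_parallel_splitting(a, b, c, d):
    """Serial sketch of the general parallel splitting Gauss-Jordan scheme.

    a[0] and c[-1] are unused. x[k] always stores the latest h_{k,j};
    after the final level it holds the solution. Returns (x, beta).
    Assumes nonzero denominators and a nonzero middle unknown x_m.
    """
    n = len(b)
    if n < 4:
        raise ValueError("this sketch assumes n >= 4")
    m = (n + 1) // 2         # size of the first system
    g = [0.0] * n
    h0 = [0.0] * n           # h_{i,0}
    x = [0.0] * n            # running h_{k,j}

    def spread_up(i):
        # Algorithm 4 updates: rows above pivot i in the first system.
        prod = 1.0
        for k in range(i - 1, -1, -1):
            prod *= -g[k]
            x[k] += prod * h0[i]

    def spread_down(i):
        # Algorithm 12 updates: rows below pivot i in the second system.
        prod = 1.0
        for k in range(i + 1, n):
            prod *= -g[k]
            x[k] += prod * h0[i]

    g[0] = c[0] / b[0]
    h0[0] = x[0] = d[0] / b[0]
    g[n - 1] = a[n - 1] / b[n - 1]
    h0[n - 1] = x[n - 1] = d[n - 1] / b[n - 1]
    for i in range(1, m - 1):             # forward levels
        denom = b[i] - a[i] * g[i - 1]
        g[i] = c[i] / denom
        h0[i] = x[i] = (d[i] - a[i] * h0[i - 1]) / denom
        spread_up(i)
    for i in range(n - 2, m, -1):         # backward levels
        denom = b[i] - c[i] * g[i + 1]
        g[i] = a[i] / denom
        h0[i] = x[i] = (d[i] - c[i] * h0[i + 1]) / denom
        spread_down(i)
    # Level for beta (equation 13), then the two middle unknowns.
    P = b[m - 1] - a[m - 1] * g[m - 2]
    Q = d[m - 1] - a[m - 1] * h0[m - 2]
    R = b[m] - c[m] * g[m + 1]
    S = d[m] - c[m] * h0[m + 1]
    beta = (P * S - a[m] * Q) / (R * Q - c[m - 1] * S)
    h0[m - 1] = x[m - 1] = Q / (P + beta * c[m - 1])
    h0[m] = x[m] = beta * h0[m - 1]
    # Final level: every remaining unknown is completed at once.
    spread_up(m - 1)
    spread_down(m)
    return x, beta


# Hypothetical 8 x 8 test system with exact solution x_i = i.
n = 8
a = [0.0] + [1.0] * (n - 1)
b = [4.0] * n
c = [1.0] * (n - 1) + [0.0]
exact = [float(i + 1) for i in range(n)]
d = [(a[i] * exact[i - 1] if i > 0 else 0.0) + b[i] * exact[i]
     + (c[i] * exact[i + 1] if i < n - 1 else 0.0) for i in range(n)]
x, beta = general_parallel_splitting(a, b, c, d)
```

Compared with the parallel splitting sketch, the back substitution has disappeared: the updates performed during the sweeps leave only one final level after $\beta$ is known, which is where the count of $\frac{n}{2} + 2$ sequential steps for even n comes from.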



6. COMMENTS 



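1. In this paper we consider a general tridiagonal system of equations of order n. In the previous two sections
we took n to be even; now let n be odd. In this case the only change in the parallel splitting Gauss-Jordan
algorithm and in the general parallel splitting Gauss-Jordan algorithm is that we split the system after the
$\frac{n+1}{2}$th and before the $\frac{n+3}{2}$th equations, to get two systems of unequal size. In this case
$\beta$ in the parallel splitting Gauss-Jordan algorithm can be obtained as:

$$\beta = \frac{(b_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} g_{\frac{n-1}{2}})(d_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} h_{\frac{n+5}{2}}) - a_{\frac{n+3}{2}}(d_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} h_{\frac{n-1}{2}})}{(b_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} g_{\frac{n+5}{2}})(d_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} h_{\frac{n-1}{2}}) - c_{\frac{n+1}{2}}(d_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} h_{\frac{n+5}{2}})} \tag{14}$$

and $\beta$ in the general parallel splitting Gauss-Jordan algorithm can be obtained as:

$$\beta = \frac{(b_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} g_{\frac{n-1}{2}})(d_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} h_{\frac{n+5}{2},0}) - a_{\frac{n+3}{2}}(d_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} h_{\frac{n-1}{2},0})}{(b_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} g_{\frac{n+5}{2}})(d_{\frac{n+1}{2}} - a_{\frac{n+1}{2}} h_{\frac{n-1}{2},0}) - c_{\frac{n+1}{2}}(d_{\frac{n+3}{2}} - c_{\frac{n+3}{2}} h_{\frac{n+5}{2},0})} \tag{15}$$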



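2. For the parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}+1} = 0$, which implies that $\beta = 0$,
then in this case

$$x_{\frac{n}{2}} = h_{\frac{n}{2}} = \frac{d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1}}{b_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1}} \tag{16}$$

and $x_{\frac{n}{2}+2} = h_{\frac{n}{2}+2}$.

Also, for the general parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}+1} = 0$, which implies that
$\beta = 0$, then in this case:

$$x_{\frac{n}{2}} = h_{\frac{n}{2},0} = \frac{d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1,0}}{b_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1}} \tag{17}$$

and $x_{\frac{n}{2}+2} = h_{\frac{n}{2}+2,0}$.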


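3. For the parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}} = 0$, which implies that $\beta$ is
undefined, then in this case:

$$x_{\frac{n}{2}+1} = h_{\frac{n}{2}+1} = \frac{d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2}}{b_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2}} \tag{18}$$

and $x_{\frac{n}{2}-1} = h_{\frac{n}{2}-1}$.

Also, for the general parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}} = 0$, which implies that
$\beta$ is undefined, then in this case:

$$x_{\frac{n}{2}+1} = h_{\frac{n}{2}+1,0} = \frac{d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2,0}}{b_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2}} \tag{19}$$

and $x_{\frac{n}{2}-1} = h_{\frac{n}{2}-1,0}$.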



4. For the parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}} = x_{\frac{n}{2}+1} = 0$, which implies
that $\beta = \frac{0}{0}$ (undefined), then $x_{\frac{n}{2}-1} = h_{\frac{n}{2}-1}$ and
$x_{\frac{n}{2}+2} = h_{\frac{n}{2}+2}$.

For the general parallel splitting Gauss-Jordan algorithm, if $x_{\frac{n}{2}} = x_{\frac{n}{2}+1} = 0$, which
implies that $\beta = \frac{0}{0}$ (undefined), then:

$x_{\frac{n}{2}-1} = h_{\frac{n}{2}-1,0}$ and $x_{\frac{n}{2}+2} = h_{\frac{n}{2}+2,0}$.
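
The special cases in comments 2 to 4 can be gathered into one small helper that returns the two middle unknowns. This is a hypothetical sketch: the function name, the argument names and the absolute tolerance `tol` are assumptions made here, with P, Q, R and S standing for $b_{\frac{n}{2}} - a_{\frac{n}{2}} g_{\frac{n}{2}-1}$, $d_{\frac{n}{2}} - a_{\frac{n}{2}} h_{\frac{n}{2}-1}$, $b_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} g_{\frac{n}{2}+2}$ and $d_{\frac{n}{2}+1} - c_{\frac{n}{2}+1} h_{\frac{n}{2}+2}$. A production code would more likely use a tolerance scaled to the size of the coefficients.

```python
def middle_unknowns(P, Q, R, S, a_next, c_mid, tol=1e-12):
    """Return (x_mid, x_next), the two unknowns adjacent to the split.

    a_next is the sub-diagonal entry of the equation after the split and
    c_mid is the super-diagonal entry of the equation before it.
    The fixed absolute tolerance is a simplification of this sketch.
    """
    num = P * S - a_next * Q     # numerator of beta; zero when x_next = 0
    den = R * Q - c_mid * S      # denominator of beta; zero when x_mid = 0
    if abs(num) <= tol and abs(den) <= tol:
        return 0.0, 0.0          # comment 4: beta = 0/0, both unknowns vanish
    if abs(den) <= tol:
        return 0.0, S / R        # comment 3: beta undefined, equation 18
    if abs(num) <= tol:
        return Q / P, 0.0        # comment 2: beta = 0, equation 16
    beta = num / den
    x_mid = Q / (P + beta * c_mid)   # equation 7
    return x_mid, beta * x_mid       # equation 5


# Hypothetical reduced 2 x 2 middle block: 3*x_mid + 1*x_next = Q,
# 2*x_mid + 5*x_next = S.
regular = middle_unknowns(3.0, 9.0, 5.0, 19.0, 2.0, 1.0)    # x = (2, 3)
zero_mid = middle_unknowns(3.0, 3.0, 5.0, 15.0, 2.0, 1.0)   # x = (0, 3)
zero_next = middle_unknowns(3.0, 6.0, 5.0, 4.0, 2.0, 1.0)   # x = (2, 0)
```

Once the two middle unknowns are known, the remaining unknowns follow from the same back substitution as in the regular case, so the special cases only replace the level at which $\beta$ is computed.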



7. RESULTS AND PERFORMANCE 
7.1. Results 

In this section we examine some test problems, using the parallel splitting Gauss-Jordan algorithm and the
general parallel splitting Gauss-Jordan algorithm to deal with different types of tridiagonal systems of equations.

1. Example 1, using the parallel splitting Gauss-Jordan algorithm: consider the tridiagonal system of equations
with n = 8 (n is even)

(20) [the 8 x 8 coefficient matrix and right-hand side are only partly legible in this copy; surviving entries:
4, 4, 5, -7 and 6, 9, 25, 29, 31, -6, -5]

with exact solution $x_i = i$, i = 1, 2, ..., 8.
Computational steps:

1. In parallel we get $g_1 = 0.25$, $g_8 = -1.5$, $h_1 = 1.5$ and $h_8 = -2.5$.
2. In parallel we get $g_2 = 0.8888889$, $g_7 = -1.272727$, $h_2 = 4.666667$ and $h_7 = -0.636364$.
3. In parallel we get $g_3 = 0.521739$, $g_6 = -0.39759$, $h_3 = 5.086956$ and $h_6 = 4.012049$.
4. We get $\beta = 1.25$.
5. In parallel we get $x_4 = h_4 = 4$, $x_5 = h_5 = 5$.
6. In parallel we get $x_3 = h_3 = 3$, $x_6 = h_6 = 6$.
7. In parallel we get $x_2 = h_2 = 2$, $x_7 = h_7 = 7$.
8. In parallel we get $x_1 = h_1 = 1$, $x_8 = h_8 = 8$.

2. Solving example 1 using the general parallel splitting Gauss-Jordan algorithm:
Computational steps:

1. In parallel we get $g_1 = 0.25$, $g_8 = -1.5$, $h_{1,0} = 1.5$ and $h_{8,0} = -2.5$.

2. In parallel we get $g_2 = 0.8888889$, $g_7 = -1.272727$, $h_{2,0} = 4.666667$, $h_{1,1} = 0.333333$,
$h_{7,0} = -0.636364$ and $h_{8,1} = -3.454546$.

3. In parallel we get $g_3 = 0.521739$, $g_6 = -0.39759$, $h_{3,0} = 5.086956$, $h_{6,0} = 4.012049$,
$h_{2,1} = 0.144928$, $h_{1,2} = 1.463768$, $h_{7,1} = 4.469879$ and $h_{8,2} = 4.204819$.

4. We get $\beta = 1.25$.

5. In parallel we get $x_4 = h_{4,0} = 4$, $x_5 = h_{5,0} = 5$.

6. In parallel we get $x_1 = h_{1,3} = 1$, $x_2 = h_{2,2} = 2$, $x_3 = h_{3,1} = 3$, $x_6 = h_{6,1} = 6$,
$x_7 = h_{7,2} = 7$, $x_8 = h_{8,3} = 8$.

3. Example 2: this is example 1 with $x_4 = 0$, so it has the same matrix of coefficients A, with the right-hand
side vector $(6, 9, 9, 37, 19, 29, -6, -5)^T$. Computational steps:

1. In parallel we get $g_1 = 0.25$, $g_8 = -1.5$, $h_1 = 1.5$ and $h_8 = -2.5$.

2. In parallel we get $g_2 = 0.8888889$, $g_7 = -1.272727$, $h_2 = 4.666667$ and $h_7 = -0.636364$.

3. In parallel we get $g_3 = 0.521739$, $g_6 = -0.39759$, $h_3 = 3$ and $h_6 = 4.012049$.

4. We get $\beta$ undefined, and hence $x_4 = 0$.

5. We get $x_5 = h_5 = 5$ using equation 18.

6. In parallel we get $x_3 = h_3 = 3$, $x_6 = h_6 = 6$.

7. In parallel we get $x_2 = h_2 = 2$, $x_7 = h_7 = 7$.

8. In parallel we get $x_1 = h_1 = 1$, $x_8 = h_8 = 8$.

4. Example 3: this is example 1 with $x_5 = 0$, so it has the same matrix of coefficients A, with the right-hand
side vector $(6, 9, 9, 12, 24, 44, -6, -5)^T$. Using the parallel splitting Gauss-Jordan algorithm, the
computational steps are:

1. In parallel we get $g_1 = 0.25$, $g_8 = -1.5$, $h_1 = 1.5$ and $h_8 = -2.5$.

2. In parallel we get $g_2 = 0.8888889$, $g_7 = -1.272727$, $h_2 = 4.666667$ and $h_7 = -0.636364$.

3. In parallel we get $g_3 = 0.521739$, $g_6 = -0.39759$, $h_3 = 5.086956$ and $h_6 = 6$.

4. We get $\beta = 0$, and hence $x_5 = 0$.

5. We get $x_4 = h_4 = 4$ using equation 16.

6. In parallel we get $x_3 = h_3 = 3$, $x_6 = h_6 = 6$.
7. In parallel we get $x_2 = h_2 = 2$, $x_7 = h_7 = 7$.
8. In parallel we get $x_1 = h_1 = 1$, $x_8 = h_8 = 8$.

7.2. Performance 

The two new parallel algorithms, the parallel splitting Gauss-Jordan algorithm and the general parallel splitting
Gauss-Jordan algorithm, are easily distributed over different numbers of processors. Our implementations are also
efficient: in general, a linear speed-up is obtained as the number of processors increases. In addition, we do not
impose any restrictions on the shape of the tridiagonal system of equations or on n. In general, for an n x n
tridiagonal system, the parallel Gauss-Jordan algorithm needs n sequential steps using n + 1 processors, and has the
advantage that all solutions are obtained immediately at the same time (in a sense, n unknowns are dealt with at the
same time). The parallel splitting Gauss-Jordan algorithm needs n sequential steps and has the advantage that it uses
only $\frac{n}{2}$ processors, but the solutions are obtained in parallel pairs.

The general parallel splitting Gauss-Jordan algorithm has the advantages that it needs only $\frac{n}{2} + 2$
sequential steps using n processors, and that all solutions are obtained immediately at the same time. These parallel
algorithms can also easily be extended to solve block tridiagonal systems of equations; see [1].
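
The step and processor counts quoted in this paper can be tabulated for a given even n with a few lines of Python. The function name and the dictionary layout are assumptions of this sketch; the figures themselves are the ones stated in the text (2n levels for the sequential algorithm, for which no processor count is given, hence `None`).

```python
def step_counts(n):
    """Return {algorithm: (sequential steps, processors)} for even n."""
    if n % 2 != 0:
        raise ValueError("the counts quoted in the text are for even n")
    return {
        "sequential Gauss-Jordan": (2 * n, None),
        "parallel Gauss-Jordan": (n, n + 1),
        "parallel splitting Gauss-Jordan": (n, n // 2),
        "general parallel splitting Gauss-Jordan": (n // 2 + 2, n),
    }


# Example: the n = 8 case used throughout the paper.
for name, (steps, procs) in step_counts(8).items():
    print(name, steps, procs)
```

For n = 8 the general parallel splitting entry gives 6 sequential steps, matching the six levels listed in section 5.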

8. CONCLUSIONS 

The two new parallel algorithms, the parallel splitting Gauss-Jordan algorithm and the general parallel splitting
Gauss-Jordan algorithm, are easily distributed over different numbers of processors. Our implementations are also
efficient: in general, a linear speed-up is obtained as the number of processors increases. In addition, we do not
impose any restrictions on the shape of the tridiagonal system of equations or on n. In general, for an n x n
tridiagonal system, the parallel Gauss-Jordan algorithm needs n sequential steps using n + 1 processors, and has the
advantage that all solutions are obtained immediately at the same time (in a sense, n unknowns are dealt with at the
same time). The parallel splitting Gauss-Jordan algorithm needs n sequential steps and has the advantage that it uses
only $\frac{n}{2}$ processors, but the solutions are obtained in parallel pairs.

The general parallel splitting Gauss-Jordan algorithm has the advantages that it needs only $\frac{n}{2} + 2$
sequential steps using n processors, and that all solutions are obtained immediately at the same time. These parallel
algorithms can also easily be extended to solve block tridiagonal systems of equations; see [1].